Active Nearest-Neighbor Learning in Metric Spaces
We propose a pool-based non-parametric active learning algorithm for general
metric spaces, called MArgin Regularized Metric Active Nearest Neighbor
(MARMANN), which outputs a nearest-neighbor classifier. We give prediction
error guarantees that depend on the noisy-margin properties of the input
sample, and are competitive with those obtained by previously proposed passive
learners. We prove that the label complexity of MARMANN is significantly lower
than that of any passive learner with similar error guarantees. MARMANN is
based on a generalized sample compression scheme, and a new label-efficient
active model-selection procedure.
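As a minimal illustration of the pool-based active nearest-neighbor setting (not the MARMANN procedure itself), the Python sketch below queries a label budget with a farthest-first rule, a standard metric-net heuristic, and outputs a 1-NN classifier; the oracle, budget, and query rule are all illustrative assumptions.

```python
# Sketch of pool-based active nearest-neighbor learning. NOT MARMANN;
# it only illustrates the pipeline: query a few labels, then output a
# nearest-neighbor classifier over the queried subset.
import numpy as np

rng = np.random.default_rng(0)

def nn_predict(X_lab, y_lab, X):
    """1-nearest-neighbor prediction under the Euclidean metric."""
    d = np.linalg.norm(X[:, None, :] - X_lab[None, :, :], axis=2)
    return y_lab[d.argmin(axis=1)]

def active_nn(pool, oracle, budget):
    """Query the pool point farthest from all labeled points
    (farthest-first traversal, a standard metric-net heuristic)."""
    queried, labels = [0], [oracle(pool[0])]   # seed arbitrarily
    for _ in range(budget - 1):
        d = np.linalg.norm(pool[:, None, :] - pool[queried][None, :, :], axis=2)
        i = int(d.min(axis=1).argmax())
        queried.append(i)
        labels.append(oracle(pool[i]))
    return pool[queried], np.array(labels)

# Toy usage with a hypothetical linear labeling rule.
pool = rng.uniform(-1, 1, size=(500, 2))
oracle = lambda x: int(x[0] + x[1] > 0)
X_lab, y_lab = active_nn(pool, oracle, budget=25)
y_pool = np.array([oracle(p) for p in pool])
print("1-NN agreement with oracle:", (nn_predict(X_lab, y_lab, pool) == y_pool).mean())
```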
Efficient Learning of Linear Separators under Bounded Noise
We study the learnability of linear separators in $\mathbb{R}^d$ in the presence of
bounded (a.k.a. Massart) noise. This is a realistic generalization of the random
classification noise model, where the adversary can flip each example $x$ with
probability $\eta(x) \le \eta$. We provide the first polynomial time algorithm
that can learn linear separators to arbitrarily small excess error in this
noise model under the uniform distribution over the unit ball in $\mathbb{R}^d$, for
some constant value of $\eta$. While widely studied in the statistical learning
theory community in the context of getting faster convergence rates,
computationally efficient algorithms in this model had remained elusive. Our
work provides the first evidence that one can indeed design algorithms
achieving arbitrarily small excess error in polynomial time under this
realistic noise model and thus opens up a new and exciting line of research.
We additionally provide lower bounds showing that popular algorithms such as
hinge loss minimization and averaging cannot lead to arbitrarily small excess
error under Massart noise, even under the uniform distribution. Our work
instead makes use of a margin-based technique developed in the context of
active learning. As a result, our algorithm is also an active learning
algorithm with label complexity that is only logarithmic in the desired excess
error.
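To make the noise model concrete, here is a short Python simulation: uniform samples from the unit ball in $\mathbb{R}^d$, halfspace labels, and each label flipped with probability $\eta(x) \le \eta$. The specific flip-probability profile (more noise near the boundary) is a hypothetical adversary chosen only for illustration, not taken from the paper.

```python
# Simulating bounded (Massart) label noise over the unit ball.
import numpy as np

rng = np.random.default_rng(1)
d, n, eta = 5, 10_000, 0.2

# Uniform samples from the unit ball: normalized Gaussians scaled by U^(1/d).
X = rng.normal(size=(n, d))
X *= (rng.uniform(size=n) ** (1 / d) / np.linalg.norm(X, axis=1))[:, None]

w_star = np.zeros(d); w_star[0] = 1.0               # target halfspace
y_true = np.sign(X @ w_star)

# Hypothetical adversary: flip probability approaches eta near the margin.
flip_prob = eta * np.exp(-5 * np.abs(X @ w_star))   # eta(x) <= eta everywhere
y = y_true * np.where(rng.uniform(size=n) < flip_prob, -1, 1)
print("fraction of flipped labels:", (y != y_true).mean())
```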
Advances in Neural Information Processing Systems
Better understanding of the potential benefits of information transfer and representation learning is an important step towards the goal of building intelligent systems that are able to persist in the world and learn over time. In this work, we consider a setting where the learner encounters a stream of tasks but is able to retain only limited information from each encountered task, such as a learned predictor. In contrast to most previous works analyzing this scenario, we do not make any distributional assumptions on the task generating process. Instead, we formulate a complexity measure that captures the diversity of the observed tasks. We provide a lifelong learning algorithm with error guarantees for every observed task (rather than on average). We show sample complexity reductions in comparison to solving every task in isolation in terms of our task complexity measure. Further, our algorithmic framework can naturally be viewed as learning a representation from encountered tasks with a neural network.
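The following Python sketch is one illustrative instantiation of this limited-memory setting, not the paper's algorithm: the learner retains only a linear predictor per task and, on each later task, fits a weighted vote over the retained predictors. The task generator, logistic fit, and tanh feature map are all assumptions made for the toy example.

```python
# Lifelong learning with limited per-task memory: retain only a
# predictor per task, reuse retained predictors as features later.
import numpy as np

rng = np.random.default_rng(2)

def fit_logistic(X, y, steps=500, lr=0.5):
    """Plain gradient-descent logistic regression; labels y in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w += lr * X.T @ (y / (1 + np.exp(y * (X @ w)))) / len(y)
    return w

base = rng.normal(size=3)           # shared structure across toy tasks
memory = []                         # retained predictors, one per task
for t in range(5):                  # stream of related tasks
    w_true = base + 0.3 * rng.normal(size=3)
    X = rng.normal(size=(200, 3))
    y = np.sign(X @ w_true)
    if memory:
        # Represent each point by the retained predictors' scores and
        # learn a weighted vote over them for the new task.
        F = np.tanh(X @ np.array(memory).T)
        alpha = fit_logistic(F, y)
        print(f"task {t}: vote accuracy {(np.sign(F @ alpha) == y).mean():.2f}")
    memory.append(fit_logistic(X, y))   # retain only the predictor
```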
Adversarially Robust Learning with Tolerance
We initiate the study of tolerant adversarial PAC-learning with respect to
metric perturbation sets. In adversarial PAC-learning, an adversary is allowed
to replace a test point $x$ with an arbitrary point in a closed ball of radius $r$
centered at $x$. In the tolerant version, the error of the learner is
compared with the best achievable error with respect to a slightly larger
perturbation radius $(1+\gamma)r$. This simple tweak helps us bridge the gap
between theory and practice and obtain the first PAC-type guarantees for
algorithmic techniques that are popular in practice.
Our first result concerns the widely-used ``perturb-and-smooth'' approach for
adversarial learning. For perturbation sets with doubling dimension $d$, we
show that a variant of these approaches PAC-learns any hypothesis class
$\mathcal{H}$ with VC-dimension $v$ in the $\gamma$-tolerant adversarial
setting with $O\left(v(1+1/\gamma)^{O(d)}/\varepsilon\right)$ samples.
This is in contrast to the traditional (non-tolerant) setting in which, as we
show, the perturb-and-smooth approach can provably fail.
Our second result shows that one can PAC-learn the same class using
$\widetilde{O}\left(dv\log(1+1/\gamma)/\varepsilon^{2}\right)$ samples
even in the agnostic setting. This result is based on a novel compression-based
algorithm, and achieves a linear dependence on the doubling dimension as well
as the VC-dimension. This is in contrast to the non-tolerant setting where
there is no known sample complexity upper bound that depends polynomially on the
VC-dimension.
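The perturb-and-smooth idea can be rendered generically in a few lines of Python. This sketch makes simplifying assumptions (cube rather than metric-ball perturbations, a 1-NN base learner, arbitrary radii) and is not the specific variant analyzed in the paper: train on perturbed data, then predict by a majority vote over perturbed copies of the test point.

```python
# Generic "perturb-and-smooth": perturb the training data, smooth the
# prediction by voting over perturbations of the test point.
import numpy as np

rng = np.random.default_rng(3)
r, gamma, k = 0.1, 0.5, 32        # radius, tolerance, smoothing draws

def perturb(X, radius):
    """Add uniform noise of the given radius (a cube, for simplicity)."""
    return X + rng.uniform(-radius, radius, size=X.shape)

def smooth_predict(X_tr, y_tr, x, radius):
    """Majority vote of 1-NN predictions over perturbed copies of x."""
    votes = [y_tr[np.linalg.norm(X_tr - perturb(x[None, :], radius)[0],
                                 axis=1).argmin()] for _ in range(k)]
    return max(set(votes), key=votes.count)

X = rng.uniform(-1, 1, size=(300, 2))
y = (X[:, 0] > 0).astype(int)
X_tr = perturb(X, (1 + gamma) * r)     # train on perturbed points
print(smooth_predict(X_tr, y, np.array([0.4, -0.2]), r))
```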
Learning with non-Standard Supervision
Machine learning has enjoyed astounding practical
success in a wide range of applications in recent
years, a practical success that often hurries ahead of our
theoretical understanding. The standard framework for machine
learning theory assumes full supervision, that is, training data
consists of correctly labeled iid examples from the same task
that the learned classifier is supposed to be applied to.
However, many practical applications successfully make use of
the sheer abundance of data that is currently produced. Such
data may not be labeled or may be collected from various
sources.
The focus of this thesis is to provide a theoretical analysis of
machine learning regimes where the learner is given such
(possibly large amounts of) non-perfect training data. In
particular, we investigate the benefits and limitations of
learning with unlabeled data in semi-supervised learning and
active learning as well as benefits and limitations of learning
from data that has been generated by a task that is different
from the target task (domain adaptation learning).
For all three settings, we propose
Probabilistic Lipschitzness to model the relatedness between the labels and the underlying domain space, and we
discuss our suggested notion by comparing it to other common
data assumptions.
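For reference, one common formulation of Probabilistic Lipschitzness from the author's work can be stated as follows; phrasings vary slightly between papers, so this should be read as a representative version rather than the thesis's exact definition.

```latex
% Probabilistic Lipschitzness (one common formulation). For a labeling
% function f : X -> {0,1}, a metric d on X, a marginal distribution D,
% and a function phi : R_{>0} -> [0,1]:
\[
  f \text{ satisfies } \phi\text{-Probabilistic Lipschitzness w.r.t.\ } D
  \quad \text{if, for all } \lambda > 0, \quad
  \Pr_{x \sim D}\bigl[\exists\, y \in X : |f(x) - f(y)| > \lambda\, d(x, y)\bigr]
  \le \phi(\lambda).
\]
% For binary labels the event says x has an oppositely labeled point
% within distance 1/lambda, so phi bounds the mass of such margin
% violators, interpolating between standard Lipschitzness and
% arbitrary labelings.
```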
Generative Multiple-Instance Learning Models For Quantitative Electromyography
We present a comprehensive study of the use of generative modeling approaches
for Multiple-Instance Learning (MIL) problems. In MIL a learner receives
training instances grouped together into bags with labels for the bags only
(which might not be correct for the individual instances within a bag). Our work was
motivated by the task of facilitating the diagnosis of neuromuscular disorders
using sets of motor unit potential trains (MUPTs) detected within a muscle
which can be cast as a MIL problem. Our approach leads to a state-of-the-art
solution to the problem of muscle classification. By introducing and analyzing
generative models for MIL in a general framework and examining a variety of
model structures and components, our work also serves as a methodological guide
to modeling MIL tasks. We evaluate our proposed methods both on MUPT datasets
and on the MUSK1 dataset, one of the most widely used benchmarks for MIL.
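As a toy illustration of the generative MIL viewpoint, the Python sketch below scores a bag's label likelihood under the standard MIL assumption that a bag is positive iff it contains at least one positive instance. The Gaussian instance densities and the prior are hypothetical choices for illustration, not one of the model structures studied in the paper.

```python
# Generative MIL: instance-level class-conditional densities, bag label
# likelihoods via the "at least one positive instance" assumption.
import numpy as np
from scipy.stats import multivariate_normal as mvn

rng = np.random.default_rng(4)

# Hypothetical instance-level densities for the two instance classes.
p_neg = mvn(mean=[0, 0], cov=np.eye(2))
p_pos = mvn(mean=[2, 2], cov=np.eye(2))
pi = 0.3                      # prior prob. an instance is positive

def log_lik_bag_positive(bag):
    """log P(bag, label=+): instances i.i.d. from the mixture, minus
    the all-negative case (inclusion-exclusion on 'no positives')."""
    mix = pi * p_pos.pdf(bag) + (1 - pi) * p_neg.pdf(bag)
    none_pos = (1 - pi) * p_neg.pdf(bag)
    return np.log(np.prod(mix) - np.prod(none_pos) + 1e-300)

def log_lik_bag_negative(bag):
    """log P(bag, label=-): every instance from the negative class."""
    return np.sum(np.log((1 - pi) * p_neg.pdf(bag)))

bag = rng.normal(size=(6, 2)) + [1, 1]     # toy bag of 6 instances
print(log_lik_bag_positive(bag) > log_lik_bag_negative(bag))
```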
When can unlabeled data improve the learning rate?
Göpfert C, Ben-David S, Bousquet O, Gelly S, Tolstikhin I, Urner R. When can unlabeled data improve the learning rate? In: Conference on Learning Theory (COLT). 2019.